The GlottHMM Entry for Blizzard Challenge 2011: Utilizing Source Unit Selection in HMM-Based Speech Synthesis for Improved Excitation Generation
Abstract
This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2011. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that uses glottal inverse filtering to separate the vocal tract and the glottal source from the speech signal, and models the two components individually. In this year’s entry, stabilized weighted linear prediction (SWLP) is used to yield more robust estimates of the vocal tract filter of the high-pitched female voice. After inverse filtering, the resulting source signal is parameterized into excitation features and a glottal flow pulse library, which consists of a variety of glottal flow pulses. In the synthesis stage, a unit selection scheme is used to reconstruct the source signal: by minimizing the target and concatenation costs, the best-matching glottal flow pulses are selected from the pulse library in order to create a natural voice source. Finally, speech is synthesized by filtering the excitation signal with the vocal tract filter.
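The pulse selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the cost functions or the search procedure, so everything below is an assumption made for illustration: the target cost is taken as a squared distance between the desired excitation features and each library pulse's features, the concatenation cost as a squared distance between consecutive pulse waveforms (assumed to be length-normalized), and the search as a Viterbi-style dynamic program. The function names `sq_dist` and `select_pulses` and the weights `w_target` and `w_concat` are hypothetical and do not come from the GlottHMM system.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_pulses(
    targets: Sequence[Sequence[float]],
    library_feats: Sequence[Sequence[float]],
    library_pulses: Sequence[Sequence[float]],
    w_target: float = 1.0,
    w_concat: float = 1.0,
) -> List[int]:
    """Return one library index per frame, minimizing the summed
    target and concatenation costs with dynamic programming.

    Assumed costs (not taken from the paper):
      target cost  = w_target * ||target features - pulse features||^2
      concat cost  = w_concat * ||previous pulse - current pulse||^2
    """
    n_frames, n_units = len(targets), len(library_feats)
    if n_frames == 0 or n_units == 0:
        return []

    # Pairwise concatenation cost between every pair of library pulses.
    concat = [
        [w_concat * sq_dist(library_pulses[i], library_pulses[j])
         for j in range(n_units)]
        for i in range(n_units)
    ]

    # cost[j]: cheapest accumulated cost of any path ending in pulse j.
    cost = [w_target * sq_dist(targets[0], f) for f in library_feats]
    back: List[List[int]] = []  # back-pointers for frames 1..n_frames-1

    for t in range(1, n_frames):
        new_cost, ptr = [], []
        for j in range(n_units):
            best_i = min(range(n_units), key=lambda i: cost[i] + concat[i][j])
            step = w_target * sq_dist(targets[t], library_feats[j])
            new_cost.append(cost[best_i] + concat[best_i][j] + step)
            ptr.append(best_i)
        cost = new_cost
        back.append(ptr)

    # Trace the cheapest path backwards from the best final pulse.
    j = min(range(n_units), key=lambda k: cost[k])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path
```

With `w_concat` set to zero the search reduces to picking the closest pulse for each frame independently; raising it favors smoother pulse sequences at the expense of target accuracy. The quadratic cost in the library size is acceptable for a sketch, although a practical system would likely prune candidates first.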
Similar resources
The GlottHMM Speech Synthesis Entry for Blizzard Challenge 2010
This paper describes the GlottHMM speech synthesis entry for Blizzard Challenge 2010. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that utilizes glottal inverse filtering for separating the vocal tract from the glottal source. The source and the filter characteristics are modeled separately in the framework of HMM. In the synthesis stage, natural glottal flow pulses are...
The GlottHMM Entry for Blizzard Challenge 2012: Hybrid Approach
This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2012. The aim of the GlottHMM system is to combine high-quality vocoding and detailed prosody modeling in order to produce expressive, high quality, synthetic speech. GlottHMM is based on statistical parametric speech synthesis, but it uses a glottal flow pulse library for generating the excitation signal. Thus, it...
NICT Blizzard Challenge 2010 Entry
This paper details a speech synthesis system developed at NICT for the Blizzard Challenge 2010. The system depends on an HMM-based speech synthesis technique that possesses two distinctive features: HMM training under global-variance constraint on the parameter trajectory and trainable mixed excitation for source-filter vocoding. For this year’s entry, we added some modifications to the system ...
The NTNU Concatenative Speech Synthesizer
This paper describes NTNU’s entry for the Blizzard Challenge 2010. Our system is a conceptually simple variation of an HMM-based unit selection system, which uses diphones as the basic unit and employs a combined selection of units and their join points. The evaluation results of the Blizzard Challenge 2010 show that the system performs well when compared with the other systems.
The USTC System for Blizzard Challenge 2014
This paper introduces the speech synthesis system developed by USTC for Blizzard Challenge 2014. Six Indian languages were evaluated this year, including Assamese, Gujarati, Hindi, Rajasthani, Tamil and Telugu. Two tasks were built for these languages: the mono-lingual task (IH1 hub task) and the multi-lingual task (IH2 spoke task). We submitted entries to both tasks in all languages. We submi...